POSTECH at TREC 2009 Blog Track: Top Stories Identification

Authors

Yeha Lee

Hun-Young Jung

Woosang Song

Jong-Hyeok Lee

Abstract

This paper describes our participation in the TREC 2009 Blog Track. Our system is built on the language model framework and consists of two components: a query likelihood component and a news headline prior component. For the query likelihood, we propose several approaches to estimating the query language model and the news headline language model. We also suggest two approaches to choosing the 10 supporting relevant posts: Feed-Based Selection and Cluster-Based Selection. Furthermore, we propose two criteria for estimating the news headline prior for a given day. Experimental results show that using the prior significantly improves performance on the task.
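As a rough illustration of how a query likelihood and a headline prior can be combined in a language model framework, the sketch below ranks headlines by log P(Q | H) + log P(H). It is not the authors' implementation: the abstract does not say how either model is estimated, so the Dirichlet-smoothed unigram model, the `mu` value, the add-one floor on collection probabilities, and the names `dirichlet_log_likelihood` and `rank_headlines` are all assumptions made for this example.

```python
import math
from collections import Counter


def dirichlet_log_likelihood(query_terms, headline_terms, collection_counts,
                             collection_size, mu=1000.0):
    """log P(Q | H) under a Dirichlet-smoothed unigram model.

    The smoothing method and mu are assumptions; the abstract does not
    specify how the headline language model is estimated.
    """
    counts = Counter(headline_terms)
    length = len(headline_terms)
    vocab = len(collection_counts)
    log_p = 0.0
    for term in query_terms:
        # Add-one floor so terms unseen in the collection do not yield log(0).
        p_coll = (collection_counts.get(term, 0) + 1) / (collection_size + vocab + 1)
        log_p += math.log((counts[term] + mu * p_coll) / (length + mu))
    return log_p


def rank_headlines(query_terms, headlines, log_prior, collection_counts,
                   collection_size):
    """Rank headline ids by log P(Q | H) + log P(H), best first."""
    scored = [
        (hid, dirichlet_log_likelihood(query_terms, terms, collection_counts,
                                       collection_size) + log_prior[hid])
        for hid, terms in headlines.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, for illustration only.
headlines = {"h1": ["election", "results"], "h2": ["football", "final"]}
collection = Counter(t for terms in headlines.values() for t in terms)
uniform_prior = {hid: math.log(0.5) for hid in headlines}
ranking = rank_headlines(["election"], headlines, uniform_prior,
                         collection, sum(collection.values()))
```

With a uniform prior the ranking is driven by the likelihood alone; a non-uniform `log_prior` shifts it, which is the role the abstract attributes to the news headline prior.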

Related papers

University of Lugano at TREC 2009 Blog Track

We report on the University of Lugano’s participation in the Blog track of TREC 2009. In particular we describe our system for performing blog distillation, faceted search and top stories identification.

From Blogs to News: Identifying Hot Topics in the Blogosphere

We describe the participation of the University of Amsterdam’s ILPS group in the blog track at TREC 2009. We focus on the top stories identification task, and take an approach that does not require the headlines of top stories to be known beforehand. We explore the feasibility of a so-called blogs to news approach: given a date and a set of blog posts, identify the main topics for that date. Th...

TREC 2010 Blog Track: Top Stories Identification

This paper describes our participation in the TREC 2010 Blog Track. For the Top Stories Identification Task, we explore the relationship among news events, news stories and blog posts. We first extract important news events from the TRC2 corpus using a probabilistic mixture model. Then, we propose a probabilistic approach to identify top news stories. Furthermore, we use an additional feature t...

ICTNET at Blog Track TREC 2009

This paper describes our participation in the Blog track of TREC 2009. Runs were submitted for both tasks, namely the top stories identification task and the faceted blog distillation task. The “FirteX” platform was used to index and retrieve posts. For the top stories identification task, to identify important headlines, we measure the importance of a headline by accumulating the BM25 relevance score w...

Top Stories Identification From Blog to News in TREC 2010 Blog Track